TSUBAKI: An Open Search Engine Infrastructure for Developing New Information Access Methodology

Authors

  • Keiji Shinzato
  • Tomohide Shibata
  • Daisuke Kawahara
  • Chikara Hashimoto
  • Sadao Kurohashi
Abstract

As the amount of information created by human beings has grown explosively over the last decade, it is becoming extremely hard to obtain necessary information using conventional information access methods. Hence, drastically new technology is needed. Developing such technology requires search engine infrastructures. Although existing search engine APIs can be regarded as such infrastructures, they have several restrictions, such as a limit on the number of API calls. To help the development of new technology, we are running an open search engine infrastructure, TSUBAKI, on a high-performance computing environment. In this paper, we describe the TSUBAKI infrastructure.

Similar resources

Review of ranked-based and unranked-based metrics for determining the effectiveness of search engines

Purpose: Traditionally, there have been many metrics for evaluating search engines; nevertheless, various researchers have proposed new metrics in recent years. Awareness of these new metrics is essential for conducting research on search engine evaluation. So, the purpose of this study was to provide an analysis of important and new metrics for evaluating search engines. Methodology: This is ...

A Cache Design of SSD-based Search Engine Architectures: An Experimental Study

Caching is an important optimization in search engine architectures. Existing caching techniques for search engine optimization are mostly biased towards reducing random accesses to disks, because random accesses are known to be much more expensive than sequential accesses in traditional magnetic hard disk drives (HDDs). Recently, the solid state drive (SSD) has emerged as a new kind of secon...

BnO at the NTCIR-11 English Fact Validation Task

This paper describes the submission of the BnO team to the RITE-VAL Fact Validation task [11] for English in NTCIR-11. In this submission, the BnO team formulated fact validation as a textual entailment task, where the objective is to find a piece of text from a corpus such that it entails the stated fact. For that purpose, the BnO team made use of search results retrieved by the search engine TSUBA...

Towards Supporting Exploratory Search over the Arabic Web Content: The Case of ArabXplore

Due to the huge amount of data published on the Web, the Web search process has become more difficult, and it is sometimes hard to get the expected results, especially when the users are less certain about their information needs. Several efforts have been proposed to support exploratory search on the web by using query expansion, faceted search, or supplementary information extracted from exte...

Open access dissemination challenges: a case study

Purpose: This paper explores dissemination, broadly considered, of an open access database as part of a librarian-faculty collaboration currently in progress. Design/methodology/approach: Dissemination of an online database by librarians is broadly considered, including metadata optimization for multiple access points and user notification methods. Findings: Librarians address open access disseminati...

Publication date: 2008